Abstract 

This paper investigates the influence of three factors on the pronunciation accuracy of Chinese adult foreign language 
learners. Ten target sounds, including phonemes and syllables, were included in the pre-test, an analysis of which 
shows that the mispronunciation of the randomly chosen target sounds mainly results from L1 negative transfer. It was 
observed during the intervention that some mispronounced target sounds are more difficult to correct than others. 
However, the hierarchy of difficulty for pronunciation acquisition cannot be constructed without considering the 
impact of task variables, for even the same subject’s performance might vary across the post-test tasks of vocabulary 
reading, sentence reading and spontaneous speech. In these tasks, individual aptitude in perception, mimicry and 
monitoring also plays a role in the improvement of pronunciation accuracy. 

Keywords: L1 negative transfer, Hierarchy of difficulty, Task, Aptitude 

1. Introduction 

It is widely observed that foreign language learners might make different types of errors in their communication. 
Researchers have sought to identify the sources of interlanguage (IL) pronunciation errors, which might contribute 
greatly to the improvement of pronunciation (Stockman & Pluut, 1992). Contrastive analysis is an early attempt in 
this aspect. The idea behind this approach is that a comparison of the learners’ native language (NL) with the target 
language (TL) would allow difficulties to be predicted (Dalton, 1994). Flege & Davidian (1984) examined the 
English pronunciation of some adult Polish, Spanish, and Chinese speakers, suggesting that both developmental 
processes and transfer processes might influence adult L2 speech production. After a study of the minimal segments 
in L2 phonology, Weinberger (1997) claims that the phenomenon of negative transfer extends beyond the level of 
individual phonemes to include syllable structures. Eckman (as cited in Celce-Murcia, Brinton, & Goodwin, 1996) 
asserts that contrastive analysis alone is not enough and proposes to remedy the deficiency of contrastive analysis by 
constructing a hierarchy of difficulty for phonological acquisition; the hierarchy might predict not only which 
sounds learners would have difficulty with, but which problems would be more difficult for a linguistically 
homogeneous group of learners. 

In addition, Thompson (1991) observed the performance of thirty-six native speakers of Russian in sentence reading, 
prose reading and spontaneous speech and concluded that the tasks in which the participants were involved would 
have an effect on their performance in pronunciation accuracy, while Jilkal et al. (2007) and Haslam (2010) claim 
that individual aptitude is also a decisive factor in learning a foreign language sound system. Liu and Fu (2011) 
conducted an empirical study and affirmed the combined effect of instruction and monitor in improving the 
pronunciation accuracy of Chinese foreign language learners. However, some issues remain to be clarified, such 
as the source of their pronunciation errors and the potential reasons for the subjects’ different degrees of 
improvement after intervention. In view of the tentative results of the above research, we intend to re-examine 
the experiment to verify the following hypotheses: (1) Negative transfer (from the level of phoneme to 
syllable structure) accounts for the majority of pronunciation errors and there exists a hierarchy of difficulty in 
pronunciation acquisition for Chinese foreign language learners. (2) Task variables and individual aptitude are also 
two important factors that influence the improvement of pronunciation accuracy. Due to the limited empirical 
studies conducted in the field of pronunciation of Chinese adult English learners, it is hoped that the present study 
might offer more insights into factors that influence their IL pronunciation accuracy. 

2. Experiment 

This section gives only a brief outline of the experiment. For details of its 
implementation (such as the choice of the participants and target sounds and the organization of the pre-test, 
intervention and post-test), please refer to Liu and Fu (2011): the present study is a supplementary discussion 
of the unsettled issues of their research, “The Combined Effect of Instruction and Monitor in Improving 
Pronunciation of Potential English Teachers”, published in the September issue of “English Language Teaching”. 
The relevant tables are listed again at the end of this paper for convenience of reference. 

In short, the experiment consisted of three parts, i.e., pre-test, intervention for the experimental group and post-test. 
In the pre-test (T1), altogether 10 words were presented, each containing an underlined target sound (including 
phonemes and syllable structures) often mispronounced (Insert Table 1 here). In the intervention phase, 
participants from the experimental group received instruction on the ten target sounds and training in monitor 
strategies. The post-test included vocabulary reading (T2), sentence reading (T3) and spontaneous speech (T4). In T2 
and T3, altogether 20 words containing the target sounds were presented: 10 of them (including a letter) appeared in 
isolation in T2, while the others were incorporated into 4 sentences in T3 (Insert table 2 here). In T4, the subjects 
from the experimental group were required to speak for 3 to 5 minutes on one of the five topics (Insert table 3 here) so 
as to elicit a speech sample containing as many target sounds as possible. Their performance in the four tests is 
summarized in table 4, where “N” indicates the number of subjects producing each target sound in T4, while 
“Percentage of N” shows the proportion of N who properly pronounced a specific sound in each test (Insert table 4 
here). 
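
The definition of “Percentage of N” can be checked with a short calculation. The Python sketch below is illustrative only and is not part of the original study: the function name `proportion_correct` is hypothetical, and the counts are those reported for T4 in Section 3.1 (2 of the 7 learners for one target sound, 1 of the 3 participants for another).

```python
def proportion_correct(correct: int, n: int) -> float:
    """Proportion of the N speakers who pronounced a target sound accurately."""
    if n <= 0:
        raise ValueError("n must be positive")
    return round(correct / n, 2)


# Counts taken from the discussion of T4 in Section 3.1.
print(proportion_correct(2, 7))  # 2 of the 7 learners
print(proportion_correct(1, 3))  # 1 of the 3 participants
```

These calls evaluate to 0.29 and 0.33, which agree with the T4 entries for the corresponding target sounds in table 4.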

3. Analysis and Implication 

3.1 Negative transfer and hierarchy of difficulty 

As mentioned in Liu and Fu (2011), the ten target sounds were chosen randomly from teaching practice 
because of their high frequency of mispronunciation. Table 4 shows that each target 
sound was accurately pronounced by a different percentage of participants in T1 and that each improved to a different 
degree after the intervention. With regard to the sources of mispronunciation, the difficulty that each 
specific sound presents to the participants and the factors that influence their improvement, insights might be gained 
from an analysis of the participants’ performances on each target sound in the tests, their reactions observed during 
the instruction and the feedback from the two scorers. 

/v/ and /θ/. These two phonemes do not cause much difficulty to the participants in spite of their non-existence in 
Mandarin. However, some participants still mispronounced them, and their mistakes can be attributed to the 
similarity of these two sounds with Chinese [w] and [s]. Fortunately, it was easy for them to identify these two target 
sounds and pronounce them accurately in T2 and T3, owing to the intervention received. Nevertheless, it was 
relatively difficult, without the visual help, to monitor the pronunciation of these two sounds appearing in their 
spontaneous speech. Accordingly, their performances deteriorated slightly in T4, where /v/ appearing in such words 
as “develop”, “however”, and “advice” was still mispronounced as [w]. Similarly, the phoneme /θ/ contained in 
words like “health”, “three”, and “birth” was mispronounced as [s]. 

/æ/. Though only 3% of the participants could pronounce this vowel accurately in T1, it was observed during the 
instruction that this sound was among those that could be easily corrected and the subjects could produce it correctly 
in T2. However, the subjects’ performances on this sound were not quite satisfactory in T4, for this phoneme was 
often replaced by [e], a sound that is close to Chinese [ai], and thus “happiness” was mispronounced by many 
participants. 

/ʒ/. Only half of the participants produced this phoneme in T4, and 40% of these 15 subjects produced this sound 
correctly in T1. Though this percentage was not very low compared with the others, the sound proved difficult to 
correct for many subjects. Many words such as “usual” and “division” were mispronounced in T4, with /ʒ/ 
replaced by a sound like [r] in Mandarin. 

/si:/. As predicted, many participants substituted [se ] for /si:/ in T1. This might result from the non-existence 
of the syllable /si:/ in Chinese, while [se ] is similar to a sound in the local dialect of Linyi city, which is 
located in the southeast of Shandong province, China. Fortunately, most of the subjects were capable of pronouncing 
it properly once the substitution error was pointed out to them. Finally, significant progress was observed in the 
post-test. 

/rəʊ/. The diphthong /əʊ/ was often mistaken for Chinese [ou] and the combination of /r/ and /əʊ/ was often 
mispronounced as [rou] in Mandarin. Nearly half of the participants made this mistake in T1. Having been used to 
the pronunciation of [rou], some participants experienced some difficulties with this target sound and could not 
pronounce words such as “role” and “froze” properly in T4. 

/wɪn/. This sound was mispronounced as [wen] in Mandarin by all 30 participants in T1, without exception. In 
addition, the correction of this target sound turned out to be rather difficult. Altogether three participants produced 
words containing this sound in T4, and only one of them succeeded in its accurate pronunciation. 

/ʌn/. This syllable was usually mispronounced as [ang] in Mandarin. Therefore, it was not surprising to note that 
most participants pronounced the word “sun” as [sang] in T1. Though this mistake was quickly corrected by most 
subjects during the instruction, several learners mispronounced “fun” as something like [fa n] in T3, which might 
be attributed to their inferior ability in perception and mimicry observed in intervention. It was also observed that 
/ʌn/ could be mispronounced as Chinese [an] when preceded by /w/, and that is why “one” was produced as [wan] 
by some participants in T4. 

/ɒŋ/. In spite of the difference between this target sound and the one above, both were mispronounced as [ang] by a 
majority of the subjects. It seemed that this error could be corrected easily once the learners realized where the 
problem lay, for most of them produced the words “song” in T2 and “strong” in T3 accurately. However, this 
mistake did reappear in the speech of some students in T4, with [b long] produced for the word “belong”, for 
instance. 

/aʊn/. It seemed that this combination presented the greatest difficulty among all the 10 target sounds. All of the 30 
subjects in the experimental group substituted [dang] for /daʊn/ in T1. It was observed during the instruction that 
many subjects could not produce the diphthong /aʊ/ properly, not to mention its combination with a consonant. In 
T4, only 2 of the 7 learners succeeded in the pronunciation of this sound, whereas the others still mispronounced such 
words as “found” and “amount”. 

The analysis above provides clear evidence for our first hypothesis. That is to say, negative transfer from the NL is a 
predominant force shaping IL pronunciation, and this also explains why these subjects exhibited similar substitution 
habits. In addition, there does exist a hierarchy of difficulty as far as the ten target sounds are concerned, for some 
mispronounced sounds such as /v/ and /θ/ are much easier for the subjects to correct after intervention than /ʒ/, 
/wɪn/ etc. However, the hierarchy should be dynamic, and different hierarchy structures might be formed with the 
change of task types. Therefore, it is very difficult to sequence the target sounds exactly due to the interference of 
task variables. 
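
The task dependence of the hierarchy can be illustrated with the proportions in table 4. The following Python sketch is not part of the original analysis. It assumes that the rows of table 4 follow the order of the pre-test words in table 1, labels each target sound by that word, and uses a hypothetical helper `rank_by_difficulty`. Because N differs from sound to sound (as few as 3 speakers), the rankings are only indicative.

```python
# Accuracy proportions copied from table 4, keyed by the pre-test word
# (table 1) that carries each target sound; order is (T1, T2, T3, T4).
TABLE_4 = {
    "very": (0.63, 1.00, 0.87, 0.77),
    "thing": (0.57, 1.00, 0.80, 0.70),
    "had": (0.03, 0.93, 0.77, 0.33),
    "pleasure": (0.40, 0.67, 0.53, 0.47),
    "see": (0.36, 0.91, 0.82, 0.73),
    "road": (0.57, 0.86, 0.71, 0.71),
    "window": (0.00, 0.67, 0.67, 0.33),
    "sun": (0.05, 0.90, 0.80, 0.65),
    "long": (0.09, 0.82, 0.82, 0.64),
    "down": (0.00, 0.57, 0.43, 0.29),
}
TESTS = ("T1", "T2", "T3", "T4")


def rank_by_difficulty(test):
    """Return the words ordered from hardest (lowest accuracy) to easiest."""
    col = TESTS.index(test)
    # sorted() is stable, so tied words keep their table order.
    return sorted(TABLE_4, key=lambda word: TABLE_4[word][col])


for test in ("T2", "T4"):
    print(test, rank_by_difficulty(test))
```

Under these assumptions the sound in “had” is among the easier items in T2 but the second hardest in T4, whereas the sound in “down” is the hardest in both tasks, so a single fixed ordering does not hold across tasks.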

3.2 The effect of task variables on pronunciation accuracy 

Table 4 shows that the most significant progress is achieved in T2 by the participants in the experimental group, 
whose performances tended to deteriorate gradually in T3 and T4. In T2, the students only needed to direct their 
attention to the individual words tested in isolation. Therefore, enough time was guaranteed for each participant to 
monitor their production. In addition, with the learners’ concerns for correctness and the help of pronunciation rules 
learned explicitly, it was not surprising that this task produced the most encouraging results. 

Thompson (1991) observed that “materials artificially constructed with many difficult sounds might 
exceed the monitoring ability of the experienced SL speakers and might result in greater perceived accentedness 
than was the case of spontaneous speech”. Nevertheless, this argument was partially refuted by the results of this 
research, in which the participants generally achieved greater progress in T3 than in T4. Though T3 resulted in better 
performances than T4, it did, as Thompson (1991) argued, place excessive demands on the ability of speakers to 
monitor their pronunciation. In order to identify the words that contain the target sounds and pronounce them 
correctly according to the rules learned, the participants of the experimental group took more time to finish the 
four specifically constructed sentences than their peers in the control group, as was observed during the 
recording. Consequently, an over-careful style was generally produced in T3 in spite of the higher scores obtained. 
In T4, attention had to be divided among lexical access, syntactic well-formedness, discourse organisation etc., so 
the participants were not always capable of monitoring their pronunciation accuracy, and thus their scores tended to 
decrease compared with T2 and T3. Therefore, the first part of the second hypothesis is verified, that is, task 
variables have their role to play in the improvement of pronunciation accuracy. 
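
The overall trend across tasks can be summarized by averaging the proportions in table 4 for each test. The sketch below is an illustration added for clarity rather than the authors' procedure: it takes an unweighted mean over the ten target sounds, which ignores the fact that N differs from sound to sound, and the helper name `mean_accuracy_by_test` is hypothetical.

```python
from statistics import mean

# Rows of table 4 in order; each tuple holds the proportions for (T1, T2, T3, T4).
TABLE_4_ROWS = [
    (0.63, 1.00, 0.87, 0.77),
    (0.57, 1.00, 0.80, 0.70),
    (0.03, 0.93, 0.77, 0.33),
    (0.40, 0.67, 0.53, 0.47),
    (0.36, 0.91, 0.82, 0.73),
    (0.57, 0.86, 0.71, 0.71),
    (0.00, 0.67, 0.67, 0.33),
    (0.05, 0.90, 0.80, 0.65),
    (0.09, 0.82, 0.82, 0.64),
    (0.00, 0.57, 0.43, 0.29),
]
TESTS = ("T1", "T2", "T3", "T4")


def mean_accuracy_by_test(rows):
    """Unweighted mean proportion across target sounds for each test."""
    columns = zip(*rows)  # transpose rows into one column per test
    return {test: round(mean(col), 2) for test, col in zip(TESTS, columns)}


means = mean_accuracy_by_test(TABLE_4_ROWS)
print(means)
```

The unweighted means come to about .27 (T1), .83 (T2), .72 (T3) and .56 (T4), which is consistent with the pattern described above: the largest gain in T2, followed by a gradual decline in T3 and T4.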

With reference to the difference in the findings between Thompson’s research and this present one, a mitigating 
consideration lies in the rating method applied in T4. It should be remembered that each target sound was judged to 
be accurate only when it was properly pronounced in all the words containing it. Therefore, a less strict rating 
method might result in possible improvement in the scores of T4 and thus the findings of Thompson might be 
justified. 
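
The difference between the strict rating described above and a less strict alternative can be sketched as follows. All data and function names in this Python example are hypothetical and serve only to illustrate the two scoring rules; the lenient rule shown (the mean share of correctly pronounced words) is one possible alternative, not a method used in the experiment. Each speaker is assumed to have produced at least one word containing the target sound.

```python
# Hypothetical data: for each speaker, whether each word containing one
# target sound was pronounced accurately in spontaneous speech.
tokens_by_speaker = {
    "speaker_a": [True, True, False],
    "speaker_b": [True, True],
    "speaker_c": [False, True, True, True],
}


def strict_accuracy(tokens_by_speaker):
    """Share of speakers who pronounced the sound correctly in every word."""
    passed = [all(tokens) for tokens in tokens_by_speaker.values()]
    return sum(passed) / len(passed)


def lenient_accuracy(tokens_by_speaker):
    """Mean per-speaker share of correctly pronounced words."""
    shares = [sum(tokens) / len(tokens) for tokens in tokens_by_speaker.values()]
    return sum(shares) / len(shares)


print(strict_accuracy(tokens_by_speaker))
print(lenient_accuracy(tokens_by_speaker))
```

With these invented tokens, only one of the three speakers passes the strict rule, although most individual words are pronounced correctly, so the lenient score is considerably higher than the strict one.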

3.3 The effect of individual aptitude on pronunciation accuracy 

The experiment shows that individual aptitude is another key factor that influences the improvement of 
pronunciation accuracy. During the instruction, it was noted that some subjects could easily perceive the difference 
between the acceptable and unacceptable pronunciation of a particular sound, and then imitate the accurate one in a 
short time. Nevertheless, it seemed rather difficult for the others to differentiate between them and to produce the 
desired sound within the same time period. Therefore, aptitude in perception and mimicry is a quality which some 
fortunate individuals possess to a greater degree than others. In addition, monitoring ability is also an important 
aptitude that greatly influences pronunciation improvement. 

Krashen & Terrell (1988) argue that the level of ability to monitor successfully varies from student to student and 
three types of learners might be differentiated: the under-users, the over-users and the optimal users. According to 
their explanation, monitor under-users are performers who do not seem to use the monitor to any extent, even when 
conditions encourage it. Monitor over-users are constantly checking their output with their learned conscious 
knowledge. As a result, they are so concerned with correctness that they have difficulty speaking with any real 
fluency. The optimal monitor users use the monitor when it is appropriate, and when it does not get in the way of 
communication. In our experiment, the subjects were expected to produce all the ten target sounds in a relatively 
limited time in T3. Therefore, the performances of the experimental group in T3 were taken as an example to 
illustrate the three types of monitor users. 

The under-users, as defined by Krashen & Terrell (1988), either could not or made no effort to use the pronunciation 
knowledge which had been taught. These students generally had to rely completely on their prior pronunciation 
habits when reading the sentences. Subject 27 was observed to be a typical under-user in this study. She finished her 
reading much more quickly than her peers, paying no attention to identifying the words containing the target sounds 
from the sentences. An examination of the results indicated that this learner scored 5, which was the lowest in T3. 

Most of the subjects could be categorized as monitor over-users in this test. Their over-reliance on the learned 
rules often resulted in an exaggerated and unnatural accent and interfered with their fluency in sentence reading. 
Subject 20 could be labelled as an extreme monitor over-user, who exhibited her dissatisfaction with her own 
performance in the pre-test. During the instruction, she was perceived to be an inferior imitator in spite of her 
apparent efforts. Compared with her peers, she took much more time to finish the sentence reading and was 
recorded to self-correct twice in T3. Nevertheless, this student achieved great progress, with a score of 8 in contrast 
to that of zero in T1. 

Several participants distinguished themselves as optimal users in T3. These prominent performers, like the 
over-users, were also concerned with applying conscious rules in sentence reading. Nevertheless, they succeeded in 
monitoring their output within the relatively limited time, leaving no impression of unnatural accent and no signs of 
hesitation. Subject 3 could be selected as an outstanding representative of this type. In spite of her inferior score of 3 
in T1, this participant managed to finish her reading with a normal accent and speed, with all the target sounds 
produced accurately. Due to the great progress she achieved in T3, special attention was paid to her performance in T4. 
The recording showed no over-careful style in her speech, and an examination of her test score indicated 
that of the seven target sounds appearing in her presentation, five were accurately pronounced, whereas only two of 
them were properly produced in T1. Therefore, the prominent performance of subject 3 demonstrated that the monitor 
might be successfully applied by optimal monitor users even in speaking. The analysis above suggests that language 
aptitude has an important influence on the learners’ performance in pronunciation, and thus the second part of our 
second hypothesis is also justified. 

Fortunately, language aptitude is not an all-or-nothing situation and may be trainable, given the most effective 
approach (Kenworthy, 1987). Therefore, the superior performances of optimal monitor users lie in their daily 
practices, which help them to internalize the pronunciation rules learned and modify their deviations. Then these 
modified sounds can be integrated into their speech with ease, while the focus is still on the conveyance of messages. 
Consequently, both under-users and over-users should, through persevering practices, manage to develop into 
optimal users capable of polishing their sounds in speaking without threatening their communication. 

4. Conclusion 

In short, both hypotheses are supported by a re-examination of the experiment. First of all, negative transfer 
constitutes the basis for the mispronunciation of language learners, which can be explained within an 
information-processing model as “a psycholinguistic procedure by means of which SL learners activate their NL 
knowledge in developing or using their IL” (Hammarberg, 1997). Secondly, there exists a hierarchy of difficulty for 
pronunciation acquisition. However, it cannot be constructed easily without considering the impact of task variables, 
for different hierarchies of difficulty might be formed in different tasks even for the same target sounds. That is 
because foreign language learners, without training their aptitude in perception, mimicry and monitoring, might slide 
back to their habitual pronunciation in spontaneous speech even for mispronunciations that are easily corrected in 
isolation. This finding supports the second hypothesis, i.e., the impact of task 
variables and individual aptitude on the improvement of pronunciation accuracy. Therefore, language learners 
should learn to differentiate various types of negative transfer, improve their aptitude and develop into optimal 
monitor users through unremitting practice, with a view to successive approximation to the TL pronunciation 
system. 

In view of certain limitations of this research, future investigations might take a number of factors into 
consideration. In the first place, the ten randomly chosen target sounds might not be representative, and a larger 
number of sounds might be included in future studies to ascertain other possible sources of mispronunciation. The 
second consideration concerns how to construct a hierarchy system incorporating different task types for the sounds 
that cause difficulty to foreign language learners. Effective methods for improving learners’ aptitude in 
pronunciation might also be explored in further studies. 

Table 1. Ten vocabulary items in the pre-test (T1) 

Word      Acceptable     Unacceptable   Word    Acceptable     Unacceptable
          pronunciation  pronunciation          pronunciation  pronunciation
very      /ˈveri/        [ˈweri]        road    /rəʊd/         [roud]
thing     /θɪŋ/          [sɪŋ]          window  /ˈwɪndəʊ/      [ˈwendəʊ]
had       /hæd/          [hed]          sun     /sʌn/          [sang]
pleasure  /ˈpleʒə/       [ˈplerə]       long    /lɒŋ/          [long]
see       /si:/          [se ]          down    /daʊn/         [dang]

Table 2. Words tested in vocabulary and sentence reading 

Words tested in vocabulary reading   Words tested in sentence reading
five      row                        live      tomorrow
faith     wind                       think     windy
mass      run                        bad       fun
measure   song                       treasure  strong
c         town                       sea       downstairs

Sentences
(1) It’s windy tomorrow. 
(2) I think there are lots of treasures in the sea. 
(3) He feels a strong desire to play downstairs. 
(4) It’s not bad to live in this city, where life is full of fun. 


Table 3. Five Topics in Spontaneous Speech (T4) 

1  On Happiness. 
2  Sexual Discrimination in China. 
3  A Typical Chinese Festival. 
4  The Necessity of Environmental Protection. 
5  Elaborate on anything which you are interested in. 


Table 4. Performances of N in T1, T2, T3, and T4 

                            Percentage of N
No.  Target sound   N     T1     T2     T3     T4
1    /v/            30    .63    1.00   .87    .77
2    /θ/            30    .57    1.00   .80    .70
3    /æ/            30    .03    .93    .77    .33
4    /ʒ/            15    .40    .67    .53    .47
5    /si:/          11    .36    .91    .82    .73
6    /rəʊ/          7     .57    .86    .71    .71
7    /wɪn/          3     .00    .67    .67    .33
8    /ʌn/           20    .05    .90    .80    .65
9    /ɒŋ/           11    .09    .82    .82    .64
10   /aʊn/          7     .00    .57    .43    .29